AITopics | parameter matrix

Collaborating Authors

parameter matrix

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

Neural Information Processing SystemsJun-14-2026, 02:51:42 GMT

Reinforcement learning (RL) yields substantial improvements in large language models' (LLMs) downstream task performance and alignment with human values. Surprisingly, such large gains result from updating only a small subnetwork comprising just 5%-30% of the parameters, with the rest effectively unchanged. We refer to this phenomenon as parameter update sparsity induced by RL. It is observed across all 7 widely-used RL algorithms (e.g., PPO, GRPO, DPO) and all 10 LLMs from different families in our experiments. This sparsity is intrinsic and occurs without any explicit sparsity-promoting regularizations or architectural constraints.

large language model, machine learning, reinforcement learning, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Add feedback

80b3224c35ad857b951b8bfccfd32fa5-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 04:06:50 GMT

machine learning, natural language, student model, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)
(10 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry: Education (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ImprovingDeepLearningOptimizationthrough ConstrainedParameterRegularization

Neural Information Processing SystemsFeb-8-2026, 01:59:18 GMT

Unlike the uniform application of asingle penalty,CPR enforces an upper bound on astatistical measure, suchas theL2-norm, ofindividual parameter matrices.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Compression-aware Training of Deep Networks

Jose M. Alvarez, Mathieu Salzmann

Neural Information Processing SystemsNov-21-2025, 13:12:14 GMT

While it has been shown that training such very deep architectures was typically easier than smaller ones [24], it is also well-known that they are highly over-parameterized.

artificial intelligence, machine learning, regularizer, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Los Altos (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

RL Fine-Tuning Heals OOD Forgetting in SFT

Jin, Hangzhan, Luan, Sitao, Lyu, Sicheng, Rabusseau, Guillaume, Rabbany, Reihaneh, Precup, Doina, Hamdaqa, Mohammad

arXiv.org Artificial IntelligenceNov-4-2025

The two-stage fine-tuning paradigm of Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) has empirically shown better reasoning performance than one-stage SFT for the post-training of Large Language Models (LLMs). However, the evolution and mechanism behind the synergy of SFT and RL are still under-explored and inconclusive. In our study, we find the well-known claim "SFT memorizes, RL generalizes" is over-simplified, and discover that: (1) OOD performance peaks at the early stage of SFT and then declines (OOD forgetting), the best SFT checkpoint cannot be captured by training/test loss; (2) the subsequent RL stage does not generate fundamentally better OOD capability, instead it plays an \textbf{OOD restoration} role, recovering the lost reasoning ability during SFT; (3) The recovery ability has boundaries, \ie{} \textbf{if SFT trains for too short or too long, RL cannot recover the lost OOD ability;} (4) To uncover the underlying mechanisms behind the forgetting and restoration process, we employ SVD analysis on parameter matrices, manually edit them, and observe their impacts on model performance. Unlike the common belief that the shift of model capacity mainly results from the changes of singular values, we find that they are actually quite stable throughout fine-tuning. Instead, the OOD behavior strongly correlates with the \textbf{rotation of singular vectors}. Our findings re-identify the roles of SFT and RL in the two-stage fine-tuning and discover the rotation of singular vectors as the key mechanism. %reversing the rotations induced by SFT, which shows recovery from forgetting, whereas imposing the SFT parameter directions onto a RL-tuned model results in performance degradation. Code is available at https://github.com/xiaodanguoguo/RL_Heals_SFT

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.12235

Country: North America > Canada > Quebec (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

80b3224c35ad857b951b8bfccfd32fa5-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 07:34:18 GMT

computational linguistic, opdf, student model, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)
(10 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry: Education (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Improving Deep Learning Optimization through Constrained Parameter Regularization

Neural Information Processing SystemsOct-9-2025, 18:49:11 GMT

Regularization is a critical component in deep learning.

experiment, regularization, weight decay, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Structured Transforms for Small-Footprint Deep Learning

Vikas Sindhwani, Tara Sainath, Sanjiv Kumar

Neural Information Processing SystemsOct-2-2025, 10:18:54 GMT

We consider the task of building compact deep learning pipelines suitable for deployment on storage and power constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state of the art models, while providing more than 3.5-fold compression.

artificial intelligence, machine learning, matrix, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Technology: